Making Complex Prediction Rules Applicable for Readers: Current Practice in Random Forest Literature and Recommendations
نویسندگان
چکیده
Ideally, prediction rules (including classifiers as a special case) should be published in such a way that readers may apply them, for example to make predictions for their own data. While this is straightforward for simple prediction rules, such as those based on the logistic regression model, this is much more difficult for complex prediction rules derived by machine learning tools. We conducted a survey of articles reporting prediction rules that were constructed using the random forest algorithm and published in PLOS ONE in 2014-2015 with the aim to identify issues related to their applicability. The presented prediction rules were applicable in only 2 of 30 identified papers, while for further 8 prediction rules it was possible to obtain the necessary information by contacting the authors. Various problems, such as non-response of the authors, hampered the applicability of prediction rules in the other cases. Based on our experiences from the survey, we formulate a set of recommendations for authors publishing complex prediction rules to ensure their applicability for readers.
منابع مشابه
Predicting the Next State of Traffic by Data Mining Classification Techniques
Traffic prediction systems can play an essential role in intelligent transportation systems (ITS). Prediction and patterns comprehensibility of traffic characteristic parameters such as average speed, flow, and travel time could be beneficiary both in advanced traveler information systems (ATIS) and in ITS traffic control systems. However, due to their complex nonlinear patterns, these systems ...
متن کاملPrediction of maximum surface settlement caused by earth pressure balance shield tunneling using random forest
Due to urbanization and population increase, need for metro tunnels, has been considerably increased in urban areas. Estimating the surface settlement caused by tunnel excavation is an important task especially where the tunnels are excavated in urban areas or beneath important structures. Many models have been established for this purpose by extracting the relationship between the settlement a...
متن کاملApplication of ensemble learning techniques to model the atmospheric concentration of SO2
In view of pollution prediction modeling, the study adopts homogenous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of Sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...
متن کاملPredicting stock prices on the Tehran Stock Exchange by a new hybridization of Fuzzy Inference System and Fuzzy Imperialist Competitive Algorithm
Investing on the stock exchange, as one of the financial resources, has always been a favorite among many investors. Today, one of the areas, where the prediction is its particular importance issue, is financial area, especially stock exchanges. The main objective of the markets is the future trend prices prediction in order to adopt a suitable strategy for buying or selling. In general, an inv...
متن کاملCosts of treatment after renal transplantation: is it worth to pay more?
Objectives: The present study aimed to provide an estimation of the current financial burden of renal transplantation therapy for insurance organisations.Methods: An Excel-based model was developed to determine the treatment costs of current clinical practice in renal transplantation therapy (RTT). Inputs were derived from Ministry of Health and insurance organizations` database, hospital and p...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016